Phrase-based Memory-based Machine Translation
Abstract
This master's thesis investigates a phrase-based approach to Memory-based Machine Translation, a form of automatic translation in which lazy-learning classifiers translate fragments of the input sentence. A parallel corpus serves as the training data for such a classifier. In the phrase-based approach, the principal component of each fragment is a phrase of arbitrary length, in contrast to prior research in the field, where this component was a single word. A key element of the research is a comparison of three methods of phrase extraction. A new decoder has been developed to deal with the characteristics unique to this approach and to reassemble the translated fragments into one final translation. The research shows that one of the proposed phrase-extraction methods is capable of outperforming previous word-based approaches, although the gain is limited and the impact of phrases proves to be smaller than anticipated.
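As a rough illustration of what "lazy-learning classifiers translating fragments" can look like, the sketch below stores training fragments in memory and classifies a new fragment by finding the most similar stored ones. It is a minimal toy under stated assumptions, not the thesis's system: the class name `MemoryBasedClassifier`, the three-feature layout (left context word, focus phrase, right context word), the overlap similarity, and the English-Dutch examples are all hypothetical choices made for this example.

```python
from collections import Counter


class MemoryBasedClassifier:
    """Toy lazy learner: training only stores instances; the work happens at lookup."""

    def __init__(self):
        # Each entry is (features, target_phrase); the feature layout is an assumption.
        self.memory = []

    def train(self, features, target):
        self.memory.append((tuple(features), target))

    def classify(self, features):
        if not self.memory:
            raise ValueError("classifier has no stored instances")
        features = tuple(features)

        def overlap(stored):
            # Overlap similarity: number of feature positions with identical values.
            return sum(a == b for a, b in zip(stored, features))

        best = max(overlap(f) for f, _ in self.memory)
        # Majority vote among the nearest stored instances.
        votes = Counter(t for f, t in self.memory if overlap(f) == best)
        return votes.most_common(1)[0][0]


# Hypothetical fragments: (left context, focus phrase, right context) -> target phrase.
clf = MemoryBasedClassifier()
clf.train(("<s>", "the house", "is"), "het huis")
clf.train(("in", "the house", "</s>"), "het huis")
clf.train(("<s>", "the bank", "is"), "de bank")
clf.train(("on", "the bank", "of"), "de oever")

print(clf.classify(("near", "the bank", "of")))    # nearest instance shares focus and right context
print(clf.classify(("<s>", "the house", "was")))
```

In this toy the focus is a whole phrase rather than a single word, which is the contrast the abstract draws with earlier word-based work; reassembling the translated fragments into a sentence would be the decoder's job and is not shown here.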
Related resources
Translating Phrases in Neural Machine Translation
Phrases play an important role in natural language understanding and machine translation (Sag et al., 2002; Villavicencio et al., 2005). However, it is difficult to integrate them into current neural machine translation (NMT) which reads and generates sentences word by word. In this work, we propose a method to translate phrases in NMT by integrating a phrase memory storing target phrases from ...
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
Set-Phrase Machine Translation Based on Multilingual Dictionaries
The paper focuses on the issues of automatic compiling of the set-phrase dictionaries for machine translation systems. The methods employed are based on translation memory acquisition principles and heuristic language processing tools. Machine learning techniques are used for extraction of new rules and templates.
Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT. Unlike previous multi-stage pipeline approaches, which directly merge TM result into the final output, the proposed models refer to the corresponding TM information associ...
Extending Memory-Based Machine Translation to Phrases
We present a phrase-based extension to memory-based machine translation. This form of example-based machine translation employs lazy-learning classifiers to translate fragments of the source sentence to fragments of the target sentence. Source-side fragments consist of variable-length phrases in a local context of neighboring words, translated by the classifier to a target-language phrase. We co...